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Intra-tumor heterogeneity is a hallmark of many cancers and may lead to therapy resistance or interfere with person- 
alized treatment strategies. Here, we combined topographic mapping of somatic breakpoints and transcriptional profiling 
to probe intra-tumor heterogeneity of treatment-naive stage IIIC/IV epithelial ovarian cancer. We observed that 
most substantial differences in genomic rearrangement landscapes occurred between metastases in the omentum and 
peritoneum versus tumor sites in the ovaries. Several cancer genes such as NF1, CDKN2A, and FANCD2 were affected by 
lesion-specific breakpoints. Furthermore, the intra-tumor variability involved different mutational hallmarks including 
lesion-specific kataegis [local mutation shower coinciding with genomic breakpoints), rearrangement classes, and coding 
mutations. In one extreme case, we identified two independent TP53 mutations in ovary tumors and omentum /peritoneum 
metastases, respectively. Examination of gene expression dynamics revealed up-regulation of key cancer pathways in- 
cluding WNT, integrin, chemokine, and Hedgehog signaling in only subsets of tumor samples from the same patient. 
Finally, we took advantage of the multilevel tumor analysis to understand the effects of genomic breakpoints on qualitative 
and quantitative gene expression changes. We show that intra-tumor gene expression differences are caused by site-specific 
genomic alterations, including formation of in-frame fusion genes. These data highlight the plasticity of ovarian cancer 
genomes, which may contribute to their strong capacity to adapt to changing environmental conditions and give rise to the 
high rate of recurrent disease following standard treatment regimes. 

[Supplemental material is available for this article.] 



In recent years, tremendous progress has been made in the un- 
derstanding of the complexity of the cancer genome (Stratton 
2011). Studies including large numbers of patients per tumor type 
have identified recurrent mutations, copy number variants, epi- 
genetic changes, and genomic rearrangements specific for certain 
cancer types (http://cancergenome.nih.gov/; The International 
Cancer Genome Consortium 2010; The Cancer Genome Atlas Re- 
search Network 2011). 

Although more than 400 commonly mutated cancer genes 
have been identified (Futreal et al. 2004; Santarius et al. 2010), 
extensive genetic heterogeneity has been noticed across dif- 
ferent cancer types and also within individual tumors (Stratton 
2011; Yates and Campbell 2012). Intra-tumor heterogeneity is 
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a result of the action of the evolutionary forces of mutation and 
selection (Stratton 2011; Yates and Campbell 2012). The tra- 
ditional linear model of cancer evolution describes multiple, 
successive cycles of mutations and selection leading to malig- 
nant tumor cells, ultimately leading to metastases (Hanahan 
and Weinberg 2000; Klein 2009; Yates and Campbell 2012). In 
contrast, parallel evolution describes dissemination of tumor 
cells from the primary tumor as a continuous process occurring 
from very early on in tumor development. These disseminated 
cells may continue to evolve independent of the primary tu- 
mor, causing the formation of metastases genetically relatively 
distinct from the primary tumor and other metastases (Gray 
2003; Klein 2009). Several studies have focused on spatial 
sampling of various cancer types to gain insight into the extent 
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and complexity of tumor evolution (Campbell et al. 2010; Yachida 
et al. 2010; Gerlinger et al. 2012; Bashashati et al. 2013). 

Here we studied intra-tumor heterogeneity in epithelial 
ovarian cancer. With an annual worldwide incidence of 220,000 
and mortality of 140,000, epithelial ovarian cancer is the leading 
cause of death among women with gynecological malignancies 
and a disease in urgent need for improved treatment (Ferlay et al. 
2010). Large-scale genomic analysis of ovarian cancer patients has 
uncovered only a few recurrently mutated genes, such as TP53 and 
mutations in BRCA1/BRCA2 (The Cancer Genome Atlas Research 
Network 2011). Ovarian cancers show a relatively high number 
of copy number variations and structural variations (SVs) (The 
Cancer Genome Atlas Research Network 2011; Malek et al. 2011; 
McBride et al. 2012). This may be explained by the high incidence 
of deregulation of genes in the homologous recombination path- 
way (BRCA1/BRCA2), which has provided opportunities for suc- 
cessful treatment with PARP inhibitors (Banerjee et al. 2010; 
McBride et al. 2012). Expression profiling has been instrumental 
to classify ovarian cancers and revealed molecular subtypes with 
prognostic relevance (Tothill et al. 2008; Verhaak et al. 2013). 
Despite these advances in understanding of ovarian cancer bi- 
ology, the cure rate has not much improved (Ledermann and 
Kristeleit 2010; Vaughan et al. 2011). 

We set out to understand the intra-tumor dynamics of treat- 
ment-naive epithelial ovarian cancer by high-resolution analysis 
of genomic rearrangements. Because the effects of genomic rear- 
rangements in tumor development are only poorly understood, we 
also examined the contribution of genomic rearrangements to 
intra-tumor differences in gene expression. We found that treat- 
ment-naive epithelial ovarian cancers exhibit remarkably diverse 
patterns of genomic rearrangements, which in turn lead to intra- 
tumor changes in gene expression, including up-regulation of 
major cancer pathways in only subsets of samples from a single 
patient. These findings provide novel insight in potential mecha- 
nisms underlying treatment resistance. 

Results 

Topographic sampling of treatment-naive epithelial 
ovarian cancer 

Ovarian cancer is often discovered when the disease is already in 
an advanced stage, resulting in the presence of a unique metastasis 
pattern with cancer cells exfoliating throughout the abdominal 
cavity following the peritoneal fluid circulation route. The tumor 
mass is often large with metastases spread throughout the abdo- 
men. Standard of care for such advanced ovarian cancer patients 



involves surgical cytoreduction before starting chemotherapy 
treatment. We obtained comprehensive tumor and whole blood 
samples from three treatment-naive advanced epithelial ovarian 
cancer patients with high tumor loads (Table 1). Patients 1 and 3 
were diagnosed with a serous adenocarcinoma, whereas patient 2 
was diagnosed with a carcinosarcoma, which is a less frequently 
observed (<l%-4%) form of epithelial ovarian cancer (Rauh-Hain 
et al. 2013). Carcinosarcoma is characterized by the mixed histol- 
ogy of carcinomatous and sarcomatous components with a more 
aggressive behavior and a poorer prognosis when compared with 
serous adenocarcinomas (Supplemental Fig. 1; Harris et al. 2003). 

For each patient, tumor biopsies were obtained during surgery 
from physically separated tumor sites in the abdomen with the 
final goal to obtain a representative set of samples (Supplemental 
Table 1; Fig. 1A). The tumor content of each sampling site was 
generally well above 50% based on histopathological measure- 
ments, although computational measurements by ASCAT in- 
dicated lower percentages (Supplemental Table 1; Supplemental 
Fig. 1; Van Loo et al. 2010). Particularly, the metastatic tumor bi- 
opsies from patient 1 (pl.IV-1, pl.IV-2, and pl.V-1) and the right 
ovary tumor sample from patient 3 (p3.III) have a relatively low 
tumor content. However, these samples were included in most of 
our analysis, because we could compensate for the lower tumor 
content by deep sequencing of identified genomic changes. A total 
of 34 samples (27 tumor, seven reference samples) were obtained 
and used for the analyses outlined below (Supplemental Table 2). 

Heterogeneity of structural and copy number variation 
in treatment-naive epithelial ovarian cancer 

Ovarian cancer is notorious for its frequent genomic instability 
(The Cancer Genome Atlas Research Network 2011; McBride et al. 
2012). Whole-genome mate-pair sequencing allows direct detec- 
tion of genomic rearrangement breakpoints based on discordantly 
oriented and spaced mate-pairs (read pairs) (Medvedev et al. 2009). 
We performed whole-genome mate-pair sequencing using an in- 
sert size of ~3 kb (Supplemental Fig. 2) for each of the biopsies in 
order to obtain a detailed and comprehensive representation of 
the genomic instability within tumor samples from three ovarian 
cancer patients. We used a breakpoint detection algorithm that 
simultaneously clusters discordant mate-pair sequencing reads 
from all tumor biopsies per patient (Kloosterman et al. 2011a), 
allowing us to genotype breakpoints that are present at low fre- 
quency with relatively high sensitivity, i.e., based on a single dis- 
cordant read pair once a robust call is made in another sample of 
the same patient. For example, given the median physical genomic 
coverage of ~50x, the data allow us to genotype a heterozygous 



Table 1. Clinical data of epithelial ovarian cancer patients included in this study 



Patient Age at time FIGO 

number of debulking Histopathology Debulking stage Post-operative clinical course 



53 



71 



77 



Moderate to poorly 
differentiated serous 
adenocarcinoma 

Carcinosarcoma 



Poorly differentiated serous 
adenocarcinoma 



Primary, optimal 



Primary, optimal 



IIIC 



IIIC 



Primary, incomplete IV 



Six cycles adjuvant combined intraperitoneal/ 
intravenous chemotherapy (cisplatin/paclitaxel). 
No recurrence until 24 mo after primary debulking. 

Six cycles adjuvant carboplatin monotherapy. 

Progressive disease during adjuvant chemotherapy. 
Patient died 1 1 mo after primary debulking. 

Three cycles neo-adjuvant carboplatin monotherapy 
followed by interval debulking. Three cycles adjuvant 
carboplatin monotherapy. Disease recurrence at 
12 mo after primary (incomplete) debulking. Patient 
died 18 mo after primary debulking. 
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Figure 1. Somatic genomic rearrangements detected in patient 1 (left), 2 (middle), and 3 (right). (A) Biopsy locations per patient. Ellipses indicate 
physically separated tumors; black dots represent biopsy locations. Ellipses are not indicative for tumor size. For patients 1 and 2, ellipses are colored 
according to the corresponding branch derived from the SV analysis (see panel Q. Patient 1 : ovaries (orange), om/per (blue). Patient 2: ovaries/pelvis 
(orange), om/per (blue). (Illustration © 201 0 Terese Winslow, U.S. Govt, has certain rights.) (B) Bar chart representing the distribution of the frequency of 
breakpoints per patient. (C) Heat map and clustering analysis of the detected somatic breakpoints per patient. Rows represent breakpoints, red and yellow 
bars indicate the presence (red) or absence (yellow) of the breakpoint in a sample. Om/per, omentum/peritoneum. (D) Distribution of somatic rear- 
rangement types per branch for patients 1 and 2 and for all patient 3 samples. 



breakpoint present in at least 5% and 14% of the tumor cells in 
samples with a tumor percentage of 90% and 30%, respectively 
(Supplemental Table 3). After stringent filtering and removal of SV 
calls present in matching normal control samples, we found between 



120 and 369 somatic genomic rearrangements across the primary and 
metastatic tumor samples in three patients (Supplemental Table 4). 
We used PCR to validate a set of breakpoint calls and could con- 
firm 95 out of 121 tested (>78% specificity) (Supplemental Fig. 3). 
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For each somatic breakpoint detected by mate-pair sequenc- 
ing, we determined the number of tumor samples that carried the 
breakpoint. For patient 1, this revealed that only 2/369 somatic 
breakpoints are shared between all samples, whereas the majority 
was found to be shared between only four or five of the eight 
samples from this patient. In patient 2 the largest number of 
breakpoints is also shared between only four samples, but 34/120 
breakpoints were shared between all nine tumor samples. In pa- 
tient 3 the vast majority of breakpoints were shared between five 
and seven of the seven tumor samples (Fig. IB). We then performed 
unsupervised hierarchical clustering using the breakpoint junc- 
tions detected across each of the biopsies per patient. For patients 
1 and 2, this revealed two clusters of samples. Particularly patient 1 
showed two extremely different branches. For both patients, one 
cluster contained all biopsies from the omentum and peritoneum, 
whereas the other cluster contained all biopsies from the ovaries, 
and for patient 2 also a biopsy from a tumor located in the pelvis. In 
contrast to the branching patterns detected in patient 1 and 2, 
a much more homogeneous pattern was detected in patient 3 
(Fig. 1C). Several breakpoints overlap with cancer genes from the 
Cancer Gene Census, including NF1, FANCD2, and CDKN2A and 
these are all targeted by breakpoints present in only subsets of 
samples (Supplemental Table 4; Futreal et al. 2004). For FANCD2 
and the cancer-related genes ERBB4 and ESR1, which are targeted 
by breakpoints in patient 2, we observed a sample-specific effect of 
the breakpoint on gene expression (Supplemental Fig. 4). Expres- 
sion of ESR1 is a prognostic factor for survival in ovarian cancer 
(Zamagni et al. 2009). 

Distinct patterns of genomic rearrangement classes are ob- 
served among different cancers (Stephens et al. 2009; Campbell 
et al. 2010; McBride et al. 2012). Analysis of breakpoint types per 
patient revealed that deletions comprised the largest subset (40%) 
in patient 1, which is consistent with recent findings indicating an 
excess of deletions in ovarian cancer with germline BRCA muta- 
tions (McBride et al. 2012), as was the case for this patient (see 
below). However, more detailed analysis of branch-specific break- 
points for patient 1 revealed a shift in rearrangement types be- 
tween the two branches despite their shared BRCA status (Fig. ID). 
The cluster of four omental/peritoneal metastatic tumors showed 
a higher percentage of somatic tandem duplication type rear- 
rangements and inversions, and a lower percentage of deletions 
and interchromosomal rearrangements when compared with the 
cluster containing the tumors on the ovaries. In line with this, we 
observed an increase in head-to-tail breakpoint junctions at a cost 
of tail-to-head junctions for patient 1 in the omental/peritoneal 
samples when compared with the samples that originated from 
both ovaries (data not shown). A difference in rearrangement types 
was, however, not apparent for the branches in patient 2. For pa- 
tient 3 we observed that the majority (>40%) of somatic SVs com- 
prise tandem duplications, in line with previous studies (Fig. ID; 
McBride et al. 2012). Inter-sample differences in rearrangement 
signatures were not detected for patient 3. 

To get further support for the dynamic patterns of heteroge- 
neity revealed by the somatic genomic breakpoints in the ovarian 
cancers studied here, we used SNP-array genotyping and copy 
number analysis. We performed unsupervised hierarchical clus- 
tering of allele frequencies derived from the SNP-array genotyping 
data. The analysis includes all SNPs with differences in allele fre- 
quencies across each of the biopsies per patient (-10-30K SNPs) 
(Methods). Similar patterns of deviating allele frequencies for 
specific subsets of samples indicate shared ancestry, whereas di- 
versity of these patterns rather suggests independent evolution. 



Heat map and clustering analysis of the allele frequencies of in- 
cluded SNPs confirmed the results from the mate-pair analysis for 
each patient (Supplemental Fig. 5). 

Intra-tumor mutational profiles in ovarian cancer 

Next, we screened coding sequences of a total of 2099 cancer genes 
across each of the biopsies (Supplemental Table 3). We validated all 
identified mutations on all tumor and matching normal tissue 
samples using PCR-based resequencing on the MiSeq at >1000x 
coverage. The MiSeq data were also used to derive or refine muta- 
tion frequencies (Supplemental Table 5). We detected 63 somatic 
single-nucleotide mutations in patient 1, and considerably fewer 
mutations in patients 2 and 3 (17 mutations per patient, Fig. 2A). 
In patient 1, we also identified a BRCA2 germline frameshift indel 
and concomitant LOH of chr 13 in the tumor biopsies. 

All patients carried TPS 3 mutations. In patient 2 and 3 a single 
TP53 mutation was detected in all tumor samples per patient. In- 
terestingly, we identified two different driver TP53 missense mu- 
tations (P278L and I195N) in patient 1, occurring at distinct tumor 
locations. Both of these mutations have been described in the 
COSMIC database (Forbes et al. 2010). Only the samples derived 
from the right ovary (pl.I-1 and pi. 1-2, the presumed primary 
tumor), contained both TP53 mutations, albeit I195N was detected 
at low frequency (l%-9% vs. 33%-77% for P278L) (Fig. 2B). 

We observed 19 mutations that were unique to only a single 
ovary tumor sample in patient 1 (private mutations). However, 
none of the four metastases in the omentum and peritoneum 
(pl.IV-1 to pl.IV-3 and pl.V-1) carried private mutations. In fact, 
all mutations identified in the omentum and peritoneum cluster of 
samples were ubiquitous and, with the exception of a mutation in 
DLL1, all variants were also identified in the samples at the right 
ovary. For patient 2, 12 of the 1 7 mutations were in FANCD2 and all 
12 occurred in samples p2.VI-l and p2.VI-2 within a window of 
1.2 kb and comprising characteristic TpCpX trinucleotides, likely 
resulting from kataegis (Nik-Zainal et al. 2012). Similar to kataegis 
described in breast cancer, these mutations coincided with SV 
uniquely present in these two tumor samples (Fig. 2C). Most of the 
other coding variants identified in patient 2 were shared between 
all samples. The majority of mutations in patient 3 were ubiqui- 
tous. Two mutations occurred in the tumor suppressor gene TSC1 
(one missense, P141R, and one essential splice mutation). The 
TSC1 splice site mutation leads to truncation of the TSC1 tran- 
script, suggesting deregulation of mTOR signaling as a possible 
contributor to tumor development in patient 3 (Supplemental 
Fig. 6; Dobbin and Landen 2013). An additional mutation was 
found in CSMD3 in this patient, which is frequently mutated in 
ovarian cancer and non-small cell lung cancer, although the 
functional role of this gene in tumor formation is not clear (The 
Cancer Genome Atlas Research Network 2011; Liu et al. 2012). 
Only two private cancer gene single-nucleotide mutations were 
found in patient 3 (sample p3.IV). 

Overall, the single-nucleotide mutation data revealed a simi- 
lar pattern of genetic heterogeneity as the structural and copy 
number variation data for patient 1; one cluster of mutations oc- 
curred at metastatic tumor sites in the omentum and peritoneum 
and another cluster of mutations was found in the tumors in the 
left ovary. Both clusters shared mutations with the presumed pri- 
mary tumor samples (pl.I-1 and pi. 1-2). Interestingly, we found 
a difference in the Transition/Transversion (Ti/Tv) ratio for the two 
branches in this patient (Fig. 2D), suggesting that distinct muta- 
tional forces acted in different branches of the ovarian tumor in 
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Figure 2. Somatic single-nucleotide mutation analysis results for patients 1, 2, and 3. (A) Regional distribution of mutations across tumor samples per 
patient. Blue gradient indicates the percentage of reads that carried the mutation. Gene colors indicate mutation impact: high, essential splice site or 
frame-shift (orange); medium, nonsynonymous (yellow); or silent, intronic, 5' or 3' UTR (white). (6) Distribution of the two TP53 missense mutations 
detected in patient 1 (P278Land I195N) across all tumor samples of this patient. (C) Kataegis as detected in patient 2 samples p2.VI-1 and pi .VI-2. The 1 2 
single-nucleotide changes in FANCD2 coincide with a genomic breakpoint, which is solely detected in these samples. (D) Transitions versus transversions 
for patient 1 . All ovarian samples (primary tumor [pi .1-1 and pi .1-2] and metastases located in the other ovary [pi .11-1 and pi .11-2]) versus the omentum/ 
peritoneum metastases (pl.lV-1, pi. IV-2, pi .IV-3, and pi .V-1). 
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patient 1 (Alexandrov et al. 2013). For patients 2 and 3 we found 
much fewer mutations and several of these were present in all 
samples. A mutation in BBS4 further supported the branching 
pattern observed in patient 2. 

Gene expression differences across ovarian cancer biopsies 
reveal intra-tumor subtypes and branch-specific pathway 
activation 

Gene expression profiling of ovarian cancer enabled classification 
in distinct subtypes associated with differences in survival and 
therapy resistance (Tothill et al. 2008; Verhaak et al. 2013). We 
performed RNA sequencing to measure gene expression across 
each of the tumor biopsies and we detected between 1000 and 
1300 differentially expressed genes per sample compared with all 
other samples of the same patient (Supplemental Table 6). Hier- 
archical clustering of gene expression differences for each of the 
patients revealed two major branches for both patient 1 and pa- 
tient 2 (Supplemental Fig. 5). For patient 3 the clustering of dif- 
ferentially expressed genes across tumor biopsies did not reveal 
any distinct subgroups. The clustering of samples based on RNA 
expression differences further substantiated intra-tumor diversity 
as observed based on genomic breakpoints. 

We used the normalized coverage for 1500 genes that define 
six different epithelial ovarian cancer subtypes to classify each of 
the tumor biopsies from patients 1 to 3 (Tothill et al. 2008; The 
Cancer Genome Atlas Research Network 201 1). The tumor biopsies 
from patient 1 fall apart into distinct subtypes following the 
branching we observed based on clustering of genomic and tran- 
scriptomic data: The samples from the omentum and peritoneum 
clearly display the CI (high stromal) signature, overlapping with 
the C2 (high immune) signature as described before (Tothill et al. 
2008), whereas samples from the ovaries rather fall into the C4 
(low stromal response) category although some expression of 
genes in the high immune signature can also be observed (Fig. 3 A). 
The samples from patient 3 display the C2 and C4 gene expression 
signatures. Patient 2 samples explicitly show the C5 (mesenchy- 
mal) signature, as expected from the histological examination, 
which indicated a carcinosarcoma (Supplemental Fig. 1). The dis- 
tant metastases of patient 2 are different from the ovary and pelvis 
samples as they also show overlap with the C2 category. 

To determine whether single-nucleotide changes observed in 
the genome were also found in expressed transcripts we analyzed 
the frequency for each of the identified mutations among RNA 
sequencing reads for patient 1 (Fig. 3B). This analysis showed that 
some single-nucleotide changes present at the DNA level are 
not expressed. In addition, we also find that alleles with single- 
nucleotide variants detected at low frequencies at the DNA level 
are expressed at very high levels. For example, the shared TP 5 3 
I195N variant and the private BBS1 and MEN1 variants are ex- 
pressed at a much higher frequency in the RNA, suggesting that 
they are relevant for tumor growth. 

Based on the top 5% most significantly differentially expressed 
genes we used Cytoscape software to evaluate whether specific 
cellular pathways or processes have altered expression in any of the 
branches or samples (Shannon et al. 2003). All branches showed 
activation of pathways or processes related to cancer development 
compared with the total pool of non-tumor samples of all three 
patients, such as the different aspects of cell division (e.g., cell 
cycle checkpoints, DNA replication, chromosome segregation) 
and growth factor and p53 signaling (Supplemental Table 6). 
Specific pathway activation was observed for the two branches in 



patients 1 and 2 (Fig. 3C; Supplemental Table 6). The samples in 
the ovarian cluster from patient 1 expressed significantly higher 
levels of genes involved in ERBB signaling and post-translational 
protein modification compared with the samples in the omentum/ 
peritoneum cluster. Up-regulation of Hedgehog/ WNT/Cadherin 
genes was observed in the ovarian and pelvic samples of patient 2. 
Interestingly, samples from the omentum and the peritoneum in 
both patients 1 and 2 had many pathways commonly up-regulated 
compared with the other samples in these patients, including 
chemokine signaling and cytokine-cytokine receptor interactions, 
immune response, extracellular matrix organization, and integrin 
signaling. Finally, cell adhesion (CAM) was one of the processes 
enriched for in the ovary and pelvis samples in patients 1 and 2, 
although the exact genes up-regulated in the samples from these 
two patients differ. 

Genomic heterogeneity causes intra-tumor differences 
in gene expression 

The marked differences in gene expression observed across bi- 
opsies from the same patient prompted us to analyze the contri- 
bution of genomic rearrangements to these differences, an aspect 
of tumor biology which is poorly understood. Based on copy 
number profiling we identified 14 large copy number gains and 
losses (range: 0.97-28 Mb) that were only present in subsets of 
samples from patient 1 and patient 2 (Supplemental Table 7). For 
each of these copy number changes we determined the mean of 
the differences in log 2 ratios from SNP-arrays and compared these 
with the mean of the log 2 ratios derived from the RNA sequencing. 
We observed a strong correlation between copy number changes 
and gene expression changes based on pairwise comparisons 
(Fig. 4A), indicating that gene expression is strongly influenced 
by DNA copy number. However, on a genome-wide basis, only 
1.5%-1.8 % of the differentially expressed genes are within the 
boundaries of the large copy number changes for patients 1 and 2, 
respectively. Thus other factors, such as mutations, translocations, 
epigenetic changes, or secondary effects are likely contributing to 
the intra-tumor gene expression differences. 

To further study the effect of genomic breakpoints on gene 
expression, we utilized the precise breakpoint definition provided 
by mate-pair sequencing. We reasoned that an expression effect of 
a breakpoint should result in either a positive or negative change 
in gene expression in samples with the breakpoint, relative to 
samples without the breakpoint. To test this, we used the gene 
expression differences (log 2 ratios) derived from pairwise com- 
parisons based on each of the samples from patient 1 and catego- 
rized the comparisons in three bins: (1) Both samples have a 
breakpoint, (2) one sample has the breakpoint and the other does 
not have the breakpoint, and (3) both samples do not have 
a breakpoint. If genomic breakpoints have an effect on expression 
of the respective genes, we would anticipate an overall increase in 
fold changes in bin 2 versus bins 1 and 3. Indeed, we observed a 
significant increase of the variance of the distribution of fold 
changes in bin 2, indicating that on average, genomic rearrange- 
ment breakpoints affect gene expression both positively and neg- 
atively (Fig. 4B). 

To get a more precise picture of the effects of SVs on gene 
expression, we measured the normalized read counts for exons 
located before and after an SV breakpoint in a gene (Fig. 4C). A shift 
in the ratio of the read count before and after the breakpoint would 
be expected if the expression of the exons before and/or after a 
breakpoint has changed as a result of the breakpoint. For example, 
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Figure 3. (A) Heat map of the percentage concordance of each sample with the subtypes of ovarian cancer presented by Tothill and colleagues (Futreal 
et al. 2004; Tothill et al. 2008; Santarius et al. 201 0). CI , high stromal response; C2, high immune signature; C3, low malignant potential (LMP) signature; 
C4, low stromal response; C5, mesenchymal signature; C6, low grade endometrioid. (B) Allele frequencies of coding mutations in RNA and DNA for 
patient 1 . (C) Branch-specific expression differences of genes involved in major signaling pathways for patient 1 (top) and patient 2 (bottom). 



part of a gene could be up-regulated due to fusion with another 
partner gene or a decrease in expression could be expected if one 
half of a gene is deleted. Furthermore, measuring this ratio allows 
us to solely detect the effect of a breakpoint in the gene and exclude 



influences of other factors on gene expression (e.g., neighboring 
breakpoints, promoter methylation). Figure 4D shows a boxplot of 
the distribution of ratios for the genes that do not contain a 
breakpoint and the genes that do contain a breakpoint, indicating 
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Figure 4. Intra-tumor differences in gene expression resulting from genomic rearrangements in patient 1 . (A) Pairwise comparison of copy number 
changes and gene expression changes. (B) Boxplot showing log ratios derived from pairwise comparisons of patient 1 samples, categorized in three bins: 
(1) Both samples have a breakpoint, (2) one sample has the breakpoint and the other does not have the breakpoint, (3) both samples do not have 
a breakpoint. Statistical testing of differences in variance was performed using Levene's test. (C) Schematic representation of a method used to detect 
expression differences of exons before and after a breakpoint in a gene. Per gene, the ratio of the normalized exonic read count before and after the 
breakpoint was determined for each of the samples from patient 1 . Ratios were separated in two bins: one containing ratios derived from genes with 
a breakpoint and one containing ratios derived from genes without a breakpoint. (D) Boxplot of the distribution of ratios of the normalized exonic read 
count before and after a breakpoint for genes that contain a breakpoint (with bp) and genes that do not contain a breakpoint (no bp). The analysis was 
repeated by randomly assigning breakpoints to samples (randomized data set). Statistical testing was performed using a Mann-Whitney U-test. (£) 
Changes in gene expression for the exons of the MANBA and UBE2D3 gene exons in patient 1 . In the presence of the deletion breakpoint a MANBA-UBE2D3 
fusion gene is formed. (Red) Breakpoint present; (white) no breakpoint present. 



a marked shift in distribution toward more extreme ratios for all 
genes containing a breakpoint. We repeated the same analysis by 
randomly assigning breakpoints to samples. In this case the dis- 
tribution of ratios is the same for genes with and without break- 
points, indicating the specific effects of the breakpoints on gene 
expression (Fig. 4D). These results emphasize that expression mea- 
surements should not be determined only on a per gene basis, 
because subtle intra-gene expression differences due to rearrange- 
ments will be obscured for whole-gene measurements. 

The changes in exon expression for samples with and without 
a deletion breakpoint in the MANBA and UBE2D3 genes (patient 1) 
illustrate the sensitivity of the ratio analysis (Fig. 4E). Exons at the 3' 
end of MANBA are expressed higher in samples with the break- 
point relative to samples without the breakpoint and the reverse is 
true for exon 1. Similar effects were found for the UBE2D3 gene. 
An UBE2D3-MANBA in the frame fusion gene resulted from the 
somatic deletion in the ovary samples and the fusion transcript 
was expressed as verified by RT-PCR (Supplemental Fig. 7). 



We systematically searched for genomic rearrangements that 
are predicted to result in gene fusions and found 28 putative fu- 
sions in patient 1 (Supplemental Table 4). For 12 of the predicted 
rearrangements, we designed PCR primers for RT-PCR across the 
breakpoint junctions and we could confirm expression of seven 
fusion transcripts of which four were in frame and differentially 
expressed in tumor biopsies (Supplemental Fig. 7). Among these is 
one fusion containing the MAST4 kinase. MAST kinases are in- 
volved in recurrent fusions in breast cancer and enhance cell 
proliferation (Robinson et al. 2011). 

Discussion 

We here show that treatment-naive epithelial ovarian cancer, 
whether serous adenocarcinoma or carcinosarcoma, may display ex- 
tensive intrinsic genomic and transcriptomic heterogeneity, leading 
to a broad variety and potentially functional lesion-specific dereg- 
ulation of cellular pathways. The major genomic and transcriptomic 



Genome Research 207 



www.genome.org 



Hoogstraat et al. 



differences were found between distant metastasis located at the 
omentum or peritoneum versus tumor samples at the ovaries and 
pelvis, substantiating previous evidence for intra-tumor hetero- 
geneity in serous epithelial ovarian cancer (Khalique et al. 2007; 
Bashashati et al. 2013). The most striking heterogeneity was found 
for patient 1, where both the mutation and the SV data supported 
two subsets of tumor biopsies. This included two independent 
TP53 mutations, raising the question as to whether two separately 
initiated and evolving tumors had occurred in patient 1, with tis- 
sue from both tumors interwoven at the right ovary. So-called 
collision tumors have been described before and these are marked 
by histologically distinct tumors separated by stroma or basal 
lamina (Bige et al. 2009). Both ovaries are frequently affected in 
serous ovarian cancer (65%) (Cotran et al. 1999) and in rare cases 
this involves tumors with bi-focal origin (Abeln et al. 1995), which 
is a possibility we cannot fully exclude in our case. Both scenarios 
were not obvious from the histopathological examination. In- 
terestingly, patient 1 harbored a germline BRCA2 mutation. BRCA2 
functions in DNA repair, and disruption of BRCA2 leads to geno- 
mic instability (O'Donovan and Livingston 2010). Therefore, this 
mutation could possibly have promoted the very early separation 
of the tumor samples from this patient as opposed to the more 
coherent evolutionary patterns in patients 2 and 3. 

Multi-site profiling of genomic changes allows estimation of 
the evolutionary course of cancer development, including timing 
of mutational events (Gerlinger et al. 2012; Yates and Campbell 
2012). TP 53 mutations were present in all samples from the three 
patients, indicating that these occurred early during tumor evo- 
lution. For patient 1, we observed two subsets of samples, which 
constitute two independent tumors or very early branched sub- 
clones (with independent TP 53 mutations). This evolutionary 
pattern was supported by both mutation data and SV data. Within 
each of the branches we observed much more coherence than 
between the branches: We observed no unique mutations and just 
one unique rearrangement for the omentum and peritoneum 
samples. For the ovary samples of patient 1 we found several 
unique changes. This included 19 private mutations and 48 private 
genomic rearrangements all of which likely occurred late during 
tumor evolution, demonstrating continuous evolution at both the 
structural and mutational level in this branch. Furthermore, we 
observed indications of different mutational mechanisms operat- 
ing in each of the two subsets in patient 1, both at the level of 
genomic rearrangements and point mutations. For patient 3, we 
observed only two private mutations, whereas all other mutations 
(15) were shared between samples. A large overlap between sam- 
ples was also supported by the SV data from patient 3. However, we 
did observe 22 unique rearrangements, suggesting ongoing evo- 
lution at separate sites at the level of genomic rearrangements. 
For patient 2, the SV data showed the presence of two subsets of 
samples. However, a large fraction (34/120) of somatic SVs were 
found in all samples indicating a common evolutionary origin as 
opposed to the very early branching observed for patient 1. The 
common origin and branching in patient 2 was further supported 
by the presence of mutations shared by all samples and a unique 
coding mutation in the ovary/pelvis branch, respectively. In ad- 
dition, we observed ongoing evolution in two pelvis samples based 
on the observation of a condensed cluster of mutations in FANCD2 
coinciding with SVs, a mutational process which has been termed 
"kataegis" (Fig. 2C; Nik-Zainal et al. 2012). These data show that 
kataegis may act only regionally within the tumor of a single pa- 
tient. Similarly, we have previously also observed that chromo- 
thripsis may exclusively occur in either primary, or metastatic tumor 



samples from the same patient (Kloosterman et al. 2011b), dem- 
onstrating that massive mutation mechanisms may occur late during 
tumor development and do not necessarily represent an initiating 
event. Whole-genome sequencing should reveal more single- 
nucleotide changes and provide further insight into possible dif- 
ferences in evolutionary timing relative to SVs in ovarian cancer. 

There is a strong need for improved and targeted therapies for 
ovarian cancer to increase cure rates (Ledermann and Kristeleit 
2010; Vaughan et al. 2011; Banerjee and Kaye 2013). Several tar- 
geted therapies are being tested, but careful selection of patients for 
targeted treatment is essential (Smolle et al. 2013). We show here 
that there is major intra-tumor heterogeneity concerning expres- 
sion of cellular pathways, some of which are candidates for tar- 
geted treatment. For example, overexpression of the Hedgehog 
pathway was observed in a subset of metastases in the omentum 
and pelvic region of patient 2, compared with other pelvic lesions 
and tumor sites in the ovaries. High expression of the Hedgehog 
transcription factor gene GLI1 is associated with poor survival in 
advanced serous ovarian cancer (Ciucci et al. 2013) and Hedgehog 
components are deregulated in various sarcomas, presenting new 
treatment possibilities such as Hedgehog ligand antagonists and 
inhibition of Glil transcription activity (Kelleher et al. 2012). Also, 
we detected strong up-regulation of integrin pathway members 
in peritoneum and omentum metastases in patient 1, as well as 
elevated expression of inflammatory chemokines and cytokines 
primarily found in metastases in the omentum of both patients 
1 and 2. Expression differences in these genes in metastatic lesions 
compared with primary tumors have previously been observed 
(Davidson 2007). These may provide an attractive target for treat- 
ment of ovarian cancer (Mantovani et al. 2008; Vaughan et al. 
2011; Sawada et al. 2012), because the vast majority of patients die 
as a consequence of metastatic disease, while the primary tumor is 
often completely removed during debulking surgery. 

The cancer genome harbors a wide variety of genomic alter- 
ations. Particularly the contribution of structural genomic rear- 
rangements to tumor development is poorly understood. We here 
demonstrate that intra-tumor heterogeneity may involve cancer 
genes disrupted by genomic breakpoints present in only a subset of 
tumor masses. Furthermore, we associated the intra-tumor ex- 
pression differences with genomic rearrangement breakpoints and 
found that effects may range from altered expression due to copy 
number changes of entire genes to very subtle effects involving 
breakpoints affecting only part of a gene, all occurring within a 
single patient. These data show that the effects of genomic rear- 
rangements are profound, contribute to intra-tumor heterogene- 
ity, and may be equally important as coding mutations for tumor 
development. 

Because our study only covered multi-site analysis of three 
ovarian cancer patients, it remains to be seen how representative 
the identified genomic and transcriptomic characteristics will be in 
a larger sample set. Large-scale follow-up studies should be con- 
ducted to determine the rate of extreme intra-tumor heterogeneity 
in ovarian cancer. This aspect of tumor biology requires further 
attention to fully understand escape routes as a response to treat- 
ment and improve survival rates. 

Methods 

Patient sampling and consent 

All patients included are epithelial ovarian cancer FIGO stage III/IV 
patients undergoing primary cytoreductive debulking according to 
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the standard of care (Table 1). During primary debulking, tumor 
samples were obtained from thoroughly documented locations. 
Patients 1 and 2 underwent successful optimal debulking. In pa- 
tient 3, optimal debulking appeared infeasible perioperatively and 
only partial cytoreduction was achieved. She later underwent 
successful cytoreduction after neo-adjuvant chemotherapy. We 
only included the tumor samples obtained from the first debulking 
for patient 3. In addition, DNA from blood and saliva was obtained 
from every patient as control samples (Oragene DNA kit, DNA 
Genotek Inc.). Tumor samples were immediately forwarded from 
the operating room to the pathology department. If feasible, 
multiple core and peripheral tumor samples of each individual 
metastasis or primary tumor were snap-frozen in liquid isopentane 
within 1 h. Tissue was processed into frozen sections and stained 
by Hematoxylin & Eosin (H&E) (Supplemental Fig. 1). A patholo- 
gist reviewed all slides and confirmed tumor type, estimated tumor 
cell percentage, and amount of necrosis (Supplemental Table 1). 
This study was approved by the ethics committee of the UMC 
Utrecht, The Netherlands. Patients could indicate in a specific 
section of the informed consent form that they wanted to be in- 
formed about incidental findings in their germline DNA that could 
affect their health, or the health of their relatives. All patients 
signed informed consent before debulking. 

DNA, RNA isolation 

Fresh frozen samples were homogenized and subsequently split for 
independent DNA and RNA isolation. DNA was isolated using the 
Qiagen Genomic DNA kit (Qiagen). Total RNA was isolated using 
TRIzol reagent (Life Technologies). After isolation, DNA samples 
were stored at -20°C, RNA samples at -80°C. 

Mate-pair sequencing 

Mate-paired libraries were generated from 5 to 10 |xg of DNA iso- 
lated from tumor and control samples using the 5500 SOLiD Mate- 
Paired library kit (Life Technologies). Samples were sheared to 3-kb 
fragments by Hydroshear DNA shearing (Digilab). Per library, 2 X 
50-bp mates were sequenced on a SOLiD 5500x1 or SOLiD WildFire 
instrument. Forward and reverse tags were mapped independently 
(samse) to the reference genome (GRCh37) using BWA software 
and settings -c -1 25 -k 2 -n 10 (Li and Durbin 2009). Discordant 
reads were clustered using in-house software as described pre- 
viously (Kloosterman et al. 2011a). The software is available from 
https://github.com/Vityay/l-2-3-SV. In a first step, we estimated 
the insert size distribution and location of discordant mate-pairs. 
This was done separately for each sample. Furthermore, PCR du- 
plicates, reads with mapping quality 0, and nonuniquely mapped 
reads were removed from further analysis. As a second step clus- 
tering of discordant pairs was done for all samples from each pa- 
tient together. Two pairs are considered to belong to the same 
cluster when the distance between coordinates of their 5' tags to- 
gether with the distance between 3' tags does not exceed the me- 
dian distance of the library with the largest insert size. The search is 
continued until no clusters with at least five clones in at least one 
of the samples can be found. As analysis of discordant read pairs 
does not give exact breakpoints of structural variants, the output 
lists genome segments containing each breakpoint along with 
information about the source of the discordant pairs (samples) and 
the properties of the cluster as outlined in Supplemental Table 4. 
The orientation of the different mate-pair tags in a cluster relative 
to each other is indicated by H (or h for the minus strand) when the 
tag has its "head" side (the side that points toward the start of the 
chromosome) opposed to the pairing tag and T (or t for the minus 
strand) when a tag has its "tail" side (the side that points toward the 



end of the chromosome) opposed to the pairing tag. The clustering 
results in calling of intrachromosomal rearrangements (deletion 
type, inverted, tandem duplication type) and interchromosomal 
rearrangements. To select for somatic variants, all genomic rear- 
rangement breakpoints were filtered for normal tissue samples 
(blood, muscle, tuba) and an in-house database of mate-pair se- 
quencing data from healthy individuals. To achieve high-quality 
calling of somatic structural variants, we required at least five in- 
dependent discordant sequence reads derived from at least one 
tumor sample (Kloosterman et al. 2011a,b). For all breakpoint calls 
consistent with these criteria, presence of the breakpoint in other 
samples from the same patient was determined based on presence 
of at least one overlapping discordant read pair with the same 
orientation. 

Primers for PCR confirmation of somatic breakpoints were 
designed based on mate-pair sequencing data. PCRs were performed 
under standard conditions as described before (Kloosterman et al. 
2011a). 

Cancer gene resequencing 

SNVs and indels were detected by targeted sequencing of a total of 
2099 cancer genes. First, samples were interrogated by a designed 
"Cancer mini-genome" consisting of 1977 cancer genes. Barcoded 
fragment libraries were generated from 2 |xg of isolated DNA from 
tumor and control samples as previously described (Harakalova 
et al. 2011). Pools of libraries were enriched for 1977 cancer-related 
genes (Cancer mini-genome [Vermaat et al. 2012]) using SureSelect 
technology. Enriched libraries were sequenced on a SOLiD 5500x1 
or SOLiD WildFire instrument according to the manufacturers' 
protocol. Furthermore, the exons within a subset of 409 oncogenes 
and tumor suppressor genes were interrogated by the Ion Ampliseq 
Comprehensive Cancer Panel (Life Technologies). Libraries were 
constructed from 40 ng of isolated DNA for each sample using 
standard AmpliSeq procedures. Barcoded libraries were pooled and 
sequenced on the Ion PGM Sequencer (Life Technologies). 

Ion Torrent reads were aligned to the human reference ge- 
nome version 19 (GRCh37) using Tmap. Variant calling on Ion 
Torrent data was performed using Strelka (Saunders et al. 2012). 
SOLiD reads were mapped on the same genome version, using 
BWA (-c -1 25 -k 2 -n 10) and variant calling was done using a 
custom pipeline identifying variants with at least 10 X coverage, a 
15% allele frequency, and multiple (>2) occurrences in the seed 
(the first 25 -bp most accurately mapped part of the read) as well as 
support from independent reads (>3). All variant positions iden- 
tified in either SOLiD or Ion Torrent data were subsequently geno- 
typed in the raw data sets of both techniques for all samples using 
samtools mpileup, to ensure the presence or absence of possible 
low-frequency variants. Validation of single-nucleotide mutations 
and indels was performed by PCR amplification of mutation loci 
followed by Nextera XT library prep and sequencing on MiSeq 
(Illumina). 

SNP-array analysis 

For each sample, 200 ng DNA was used as input for copy number 
profiling using Cytol2 SNP arrays according to standard pro- 
cedures (Illumina). Genomic events were identified by applying 
ASCAT processing (Allele Specific Copy Number Analysis of Tu- 
mors) with Nexus Copy Number 6.0 (BioDiscovery). Briefly, all 
signals in tumor samples similar to those in the provided reference 
sample are excluded from analysis, increasing specificity in detect- 
ing additional events in tumor samples. 

Clustering of SNP array data was done by calculating a Eu- 
clidean distance matrix based on B-allele frequencies of all SNP 
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positions showing a heterozygous genotype in the reference sam- 
ples (allele frequency 20%-80%), and performing hierarchical 
clustering on these data using standard R functions. 

RNA sequencing 

Total RNA was processed with the Poly(A) Purist Kit (Life 
Technologies) to select for poly(A) + RNA. Next, the mRNA- 
ONLY Eukaryotic mRNA Isolation kit (Illumina) was used to 
select for 5' capped mRNA. Paired-end libraries were con- 
structed from 8-30 u,g total RNA per sample, using the SOLiD 
total RNA-seq kit (Life Technologies). Libraries were barcoded 
and sequenced on a SOLiD 5500x1 instrument in paired-end 
mode (50 X 35 bp). Forward (F3) and reverse (F5) reads were 
mapped independently to the human reference genome 
(GRCh3 7) using BWA (-c -1 25 -k 2 -n 10) (Li and Durbin 2009). 
Coverage per gene was determined by adding up read counts of 
all coding regions as determined by BEDTools 2.16.2 multicov 
(Quinlan and Hall 2010). 

Because of the lack of a true reference sample, i.e., healthy 
tissue from the ovaries of each patient, we compared each sample 
separately against the pool of all other samples of the same patient 
and determined gene expression differences within patients using 
the DEGseq R package (Wang et al. 2010). To identify the most 
significant expression differences, DEGseq output was first cor- 
rected for multiple testing by selecting genes with a P-value < 
4.6399 X 10" 8 (0.001 divided by 21552 [the number of tested 
genes]) and extracting the top 5% and bottom 5% of normalized 
log ratios. Normalized coverages of the genes emerging from this 
analysis were used to cluster samples based on Poisson Mixture 
Models generated with the HTSCluster R package (Rau et al. 2011). 
Pathway and GO biological process enrichment of up-regulated 
genes within the thus created clusters was determined by com- 
paring the core samples in these clusters to a pool of all reference 
samples from our three patients and finally to each other by DEGseq 
and selecting the top 5% genes as described above. Resulting gene 
lists were analyzed through the Reactome FI Cytoscape Plugin 
(Shannon et al. 2003), and pathways or processes containing at 
least two up-regulated genes and a false discovery rate (FDR) <0.1 
were reported as being affected. We performed molecular sub- 
typing of samples by calculating the percentage of concordance 
of up- and down-regulated genes (compared with the common 
reference pool) with the profiles presented by Tothill and col- 
leagues (Tothill et al. 2008; The Cancer Genome Atlas Research 
Network 2011). 

Data access 

The SNP-array data have been submitted to the NCBI Gene Ex- 
pression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) un- 
der accession number GSE47633. The sequencing data have been 
submitted to the European Nucleotide Archive (ENA; http://www. 
ebi.ac.uk/ena/) under accession number ERP003455. 
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